Querying Versioned Software Repositories

Authors

  • Dietrich Christopeit
  • Michael H. Böhlen
  • Carl-Christian Kanne
  • Arturas Mazeika
Abstract

A large part of today's data is stored in text documents that undergo a series of changes during their lifetime. For instance, during the development of a software product the source code changes frequently. Currently, managing such data relies on version control systems (VCSs). Extracting information from large documents and their different versions is a manual and tedious process. We present Qvestor, a system for declaratively querying documents. It leverages information about the structure of a document that is available as a context-free grammar and allows document versions to be queried declaratively through a grammar annotated with relational algebra expressions. We define and illustrate the annotation of grammars with relational algebra expressions and show how to translate the annotations to easy-to-use SQL views.

DOI: https://doi.org/10.1007/978-3-642-23737-9_4

Accepted version, posted at the Zurich Open Repository and Archive (ZORA), University of Zurich. ZORA URL: https://doi.org/10.5167/uzh-56409

Originally published at: Christopeit, Dietrich; Böhlen, Michael H; Kanne, Carl-Christian; Mazeika, Arturas (2011). Querying versioned software repositories. In: 15th International Conference on Advances in Databases and Information Systems, Vienna, Austria, 20 September 2011 to 23 September 2011, 42-55.

Similar resources

Interacting with local and remote data repositories using the stashR package

The stashR package (a Set of Tools for Administering Shared Repositories) for R implements a basic versioned key-value style database where character string keys are associated with data values. Using the S4 classes ‘localDB’ and ‘remoteDB’, and associated methods, versioned key-value databases can be either created locally on the user’s computer or accessed remotely via the Internet. The stash...

A logic foundation for a general-purpose history querying tool

Version control systems (VCS) have become indispensable software development tools. The version snapshots they store to provide support for change coordination and release management, effectively track the evolution of the versioned software and its development process. Despite this wealth of historical information, it has only been leveraged by tools that are dedicated to a specific task such ...

Indexing Highly Repetitive Collections

The need to index and search huge highly repetitive sequence collections is rapidly arising in various fields, including computational biology, software repositories, versioned collections, and others. In this short survey we briefly describe the progress made along three research lines to address the problem: compressed suffix arrays, grammar compressed indexes, and Lempel-Ziv compressed indexes.

A Comparison of Top-k Temporal Keyword Querying over Versioned Text Collections

As the web evolves over time, the amount of versioned text collections increases rapidly. Most web search engines will answer a query by ranking all known documents at the (current) time the query is posed. There are applications however (for example customer behavior analysis, crime investigation, etc.) that would need to efficiently query these sources as of some past time, that is, retrieve ...

Integrating software engineering tools and repositories with XML and XSLT

Interoperability between heterogeneous repositories and applications is often needed in Internet-based software development. At present XML is increasingly being used to integrate repositories and to express data fetched from various sources, but mismatches are encountered between the schemas of different repositories. XSLT is typically used to stylize results, but this does not utilize the ful...

Publication date: 2011